Reducing SMT Rule Table with Monolingual Key Phrase

نویسندگان

  • Zhongjun He
  • Yao Meng
  • Yajuan Lü
  • Hao Yu
  • Qun Liu
چکیده

This paper presents an effective approach to discard most entries of the rule table for statistical machine translation. The rule table is filtered by monolingual key phrases, which are extracted from source text using a technique based on term extraction. Experiments show that 78% of the rule table is reduced without worsening translation performance. In most cases, our approach results in measurable improvements in BLEU score.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Statistical Machine Translation with Monolingual Collocation

This paper proposes to use monolingual collocations to improve Statistical Machine Translation (SMT). We make use of the collocation probabilities, which are estimated from monolingual corpora, in two aspects, namely improving word alignment for various kinds of SMT systems and improving phrase table for phrase-based SMT. The experimental results show that our method improves the performance of...

متن کامل

FUN-NRC: Paraphrase-augmented Phrase-based SMT Systems for NTCIR-10 PatentMT

This paper describes FUN-NRC group’s machine translation systems that participated in the NTCIR-10 PatentMT task. The central motivation of this participation was to clarify the potential of automatically compiled collections of sub-sentential paraphrases. Our systems were built using our baseline phrase-based SMT system by augmenting its phrase table with novel translation pairs generated by c...

متن کامل

Statistical Machine Translation Support Improves Human Adjective Translation

In this paper we present a study in computer-assisted translation, investigating whether nonprofessional translators can profit directly from automatically constructed bilingual phrase pairs. Our support is based on state-of-the-art statistical machine translation (smt), consisting of a phrase table that is generated from large parallel corpora, and a large monolingual language model. In our ex...

متن کامل

LIUM SMT Machine Translation System for WMT 2010

This paper describes the development of French–English and English–French machine translation systems for the 2010 WMT shared task evaluation. These systems were standard phrase-based statistical systems based on the Moses decoder, trained on the provided data only. Most of our efforts were devoted to the choice and extraction of bilingual data used for training. We filtered out some bilingual ...

متن کامل

Chained System: A Linear Combination of Different Types of Statistical Machine Translation Systems

The paper explores a way to learn post-editing fixes of raw MT outputs automatically by combining two different types of statistical machine translation (SMT) systems in a linear fashion. Our proposed system (which we call a chained system) consists of two SMT systems: (i) a syntax-based SMT system and (ii) a phrase-based SMT system (Koehn, 2004). We first translate source sentences of the bite...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009